
    Automatic Summarization for Student Reflective Responses

    Educational research has demonstrated that asking students to respond to reflection prompts can improve both teaching and learning. However, summarizing student responses to these prompts is an onerous task for humans and poses challenges for existing summarization methods. From the input perspective, there are three challenges. First, there is a lexical variety problem, because different students tend to use different expressions. Second, there is a length variety problem, in that student inputs range from single words to multiple sentences. Third, there is a redundancy issue, since some content among student responses is not useful. From the output perspective, there are two additional challenges. First, the human summaries consist of a list of important phrases instead of sentences. Second, from an instructor's perspective, the number of students who have a particular problem or are interested in a particular topic is valuable. The goal of this research is to enhance student response summarization at multiple levels of granularity. At the sentence level, we propose a novel summarization algorithm that extends the traditional ILP-based framework with a low-rank matrix approximation to address the challenge of lexical variety. At the phrase level, we propose a phrase summarization framework that combines phrase extraction, phrase clustering, and phrase ranking. Experimental results show its effectiveness on multiple student response data sets. Also at the phrase level, we propose a quantitative phrase summarization algorithm that estimates the number of students who semantically mention the phrases in a summary. We first introduce a new phrase-based highlighting scheme for automatic summarization. It highlights the phrases in the human summaries as well as the corresponding semantically equivalent phrases in student responses. Enabled by the highlighting scheme, we improve the previous phrase-based summarization framework by developing supervised candidate phrase extraction, learning to estimate phrase similarities, and experimenting with different clustering algorithms to group phrases into clusters. Experimental results show that our proposed methods not only yield better summarization performance evaluated using ROUGE, but also produce summaries that capture the most pressing student needs.
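
    The phrase-level pipeline described in this abstract (candidate extraction, clustering of similar phrases, and ranking, with each summary phrase tagged by a count) can be pictured with a small, self-contained sketch. The example below is only an approximation of that idea: the comma-splitting extractor, TF-IDF features, agglomerative clustering, and helper names are assumptions for illustration, not the dissertation's components, which use supervised extraction and learned phrase similarities.

```python
from collections import Counter

from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer


def extract_candidate_phrases(responses):
    """Toy extractor: split each response on commas.

    The dissertation uses supervised candidate phrase extraction; this
    stand-in only keeps the pipeline runnable end to end.
    """
    phrases = []
    for response in responses:
        phrases.extend(p.strip() for p in response.split(",") if p.strip())
    return phrases


def summarize_responses(responses, n_clusters=3):
    phrases = extract_candidate_phrases(responses)
    features = TfidfVectorizer().fit_transform(phrases).toarray()

    # Group lexically similar phrases (a crude proxy for semantic clustering).
    labels = AgglomerativeClustering(n_clusters=n_clusters).fit_predict(features)

    # Rank clusters by size and report one member phrase per cluster;
    # the cluster size serves as a rough stand-in for a student count.
    summary = []
    for label, size in Counter(labels).most_common():
        representative = next(p for p, l in zip(phrases, labels) if l == label)
        summary.append((representative, size))
    return summary


if __name__ == "__main__":
    demo_responses = [
        "the proof of the chain rule, chain rule examples",
        "chain rule, integration by parts",
        "integration by parts was confusing",
    ]
    for phrase, count in summarize_responses(demo_responses):
        print(f"{count} response(s): {phrase}")
```

    The cluster size printed here is only a rough stand-in for the per-phrase student count that the quantitative phrase summarization algorithm estimates with learned semantic matching.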

    A Novel ILP Framework for Summarizing Content with High Lexical Variety

    Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies related to the same events. High lexical diversity of these documents hinders the system's ability to effectively identify salient content and reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. The paper finally sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety. Comment: Accepted for publication in the journal Natural Language Engineering, 201
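
    The core technical idea here, a low-rank approximation of the sentence-word co-occurrence matrix so that semantically related words reinforce each other, can be sketched in a few lines. The rank, the toy sentences, and the greedy scoring at the end are illustrative assumptions; the actual framework selects sentences by solving an integer linear program rather than by the simple argmax shown below.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus standing in for student responses; any short documents work.
sentences = [
    "the slides on recursion were confusing",
    "recursion examples went too fast",
    "the lecture pace was fine",
]

# Binary sentence-word co-occurrence matrix A (rows: sentences, columns: words).
vectorizer = CountVectorizer(binary=True)
A = vectorizer.fit_transform(sentences).toarray().astype(float)

# Rank-k approximation A_k = U_k S_k V_k^T; related words now share weight
# through latent dimensions instead of requiring exact string matches.
k = 2
U, S, Vt = np.linalg.svd(A, full_matrices=False)
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

# The full framework feeds A_k into an ILP that jointly selects sentences
# and concepts under a length budget; as a stand-in, score each sentence
# by its reconstructed concept coverage.
scores = A_k.sum(axis=1)
print(sentences[int(np.argmax(scores))])
```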

    Neural Chinese Word Segmentation with Lexicon and Unlabeled Data via Posterior Regularization

    Existing methods for Chinese word segmentation (CWS) usually rely on a large number of labeled sentences to train word segmentation models, which are expensive and time-consuming to annotate. Luckily, unlabeled data is usually easy to collect, and many high-quality Chinese lexicons are off-the-shelf, both of which can provide useful information for CWS. In this paper, we propose a neural approach for Chinese word segmentation which can exploit both lexicon and unlabeled data. Our approach is based on a variant of the posterior regularization algorithm, and the unlabeled data and lexicon are incorporated into model training as indirect supervision by regularizing the prediction space of CWS models. Extensive experiments on multiple benchmark datasets in both in-domain and cross-domain scenarios validate the effectiveness of our approach. Comment: 7 pages, 11 figures, accepted by the 2019 World Wide Web Conference (WWW '19).
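
    The posterior-regularization idea, pulling the model's predictions on unlabeled sentences toward tag distributions suggested by lexicon matches, can be sketched independently of any particular neural architecture. In the sketch below, the {B, M, E, S} tag set, the forward maximum-matching heuristic, and the KL-style penalty are assumptions chosen for illustration, not the paper's exact formulation.

```python
import numpy as np

TAGS = ["B", "M", "E", "S"]  # begin / middle / end of a word, or single-char word


def lexicon_tag_prior(sentence, lexicon, smooth=0.05, max_word_len=6):
    """Forward maximum matching against the lexicon, returning a soft
    per-character distribution over TAGS (each row sums to one)."""
    n = len(sentence)
    prior = np.full((n, len(TAGS)), smooth)
    i = 0
    while i < n:
        for j in range(min(n, i + max_word_len), i, -1):
            # Prefer the longest lexicon word starting at i; fall back to
            # treating the character as a single-character word.
            if sentence[i:j] in lexicon or j == i + 1:
                if j == i + 1:
                    prior[i, TAGS.index("S")] += 1.0
                else:
                    prior[i, TAGS.index("B")] += 1.0
                    prior[j - 1, TAGS.index("E")] += 1.0
                    for mid in range(i + 1, j - 1):
                        prior[mid, TAGS.index("M")] += 1.0
                i = j
                break
    return prior / prior.sum(axis=1, keepdims=True)


def pr_penalty(model_probs, prior, eps=1e-9):
    """KL(prior || model), averaged over characters: a regularization term
    that could be added to the training objective for unlabeled sentences."""
    return float(np.mean(np.sum(prior * np.log((prior + eps) / (model_probs + eps)), axis=1)))
```

    In training, a penalty like `pr_penalty` would be weighted and added to the supervised loss, so the lexicon guides the model on unlabeled text without providing hard labels.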

    An Improved Phrase-Based Approach To Annotating And Summarizing Student Course Responses

    Teaching large classes remains a great challenge, primarily because it is difficult to attend to all the student needs in a timely manner. Automatic text summarization systems can be leveraged to summarize the student feedback submitted immediately after each lecture, but it remains to be discovered what makes a good summary of student responses. In this work, we explore a new methodology that effectively extracts summary phrases from the student responses. Each phrase is tagged with the number of students who raise the issue. The phrases are evaluated along two dimensions: with respect to text content, they should be informative and well-formed, measured by the ROUGE metric; additionally, they should attend to the most pressing student needs, measured by a newly proposed metric. This work is enabled by a phrase-based annotation and highlighting scheme, which is new to the summarization task. The phrase-based framework allows us to summarize the student responses into a set of bullet points and present them to the instructor promptly.
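
    One way to picture the "number of students who raise the issue" tag is to count, for each summary phrase, the students whose responses contain a sufficiently similar phrase. The sketch below uses TF-IDF cosine similarity with a fixed threshold and assumes one response per student; the paper instead relies on phrase-based annotation and learned semantic matching, so treat this as an illustration only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def student_counts(summary_phrases, student_responses, threshold=0.5):
    """Count, for each summary phrase, how many student responses are
    lexically similar enough to it (one response per student assumed)."""
    vectorizer = TfidfVectorizer().fit(summary_phrases + student_responses)
    P = vectorizer.transform(summary_phrases)
    R = vectorizer.transform(student_responses)
    similarities = cosine_similarity(P, R)  # rows: phrases, columns: students
    return {
        phrase: int((similarities[i] >= threshold).sum())
        for i, phrase in enumerate(summary_phrases)
    }


print(student_counts(
    ["chain rule", "integration by parts"],
    ["the chain rule proof", "chain rule", "nothing was confusing"],
))
```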
